Abstract 



Data reduction techniques seek to combine variables that account for patterns of 
variation in observed dependent variables in such a way that a simpler model is available 
for analysis. Factor analysis is a data reduction technique that attempts to model or 
explain a set of variables in terms of their associations. To understand why this technique 
yields an accurate analysis, an examination of the mathematical models underlying the 
procedure is necessary. Execution of factor analysis by SAS and SPSS will then not be a 
"black box". Mathematical models underlying true factor analysis and principal 
components analysis are presented and discussed. An explanation of terms and basic 
differences is given in terms of the mathematical models. A small, heuristic example to 
illustrate the concepts and matrix algebra procedures involved in the factor analysis data 
reduction technique is included. 
Data Reduction Techniques 

Data reduction techniques discussed in this paper are generalized regression-like 
techniques. In regression, the decision to keep regressors in the model is based on finding 
the "smallest-largest" subset of regressors: smallest in the sense that costs associated with 
a large number of variables should be minimized; largest in the sense that enough variables 
need to be retained for reliable predictions to maximize variance accounted for by the 
variables. There is no firm statistical procedure to find this best subset of variables and 
personal judgment is required, as it is in all statistical analyses (Seber, 1977). To illustrate 
the magnitude of the number of possible regressions for a given situation, suppose that 
there are k possible regressors. Since each regressor is either in the equation or not, 
there are $2^k$ possible such regressions. If k is large, $2^k$ quickly becomes extremely 
large, e.g., $2^{15} = 32{,}768$. 
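
The count of candidate regressions can be checked by brute force. The following is a minimal sketch in Python (not part of the original analysis); the function name count_regressor_subsets is hypothetical and is introduced only for illustration. It enumerates every subset of k regressors and compares the count with $2^k$.

```python
from itertools import combinations


def count_regressor_subsets(k):
    """Count every subset of k candidate regressors by explicit enumeration."""
    return sum(1 for size in range(k + 1)
               for _ in combinations(range(k), size))


# Compare the enumerated count with the closed form 2**k for a few values of k.
for k in (3, 5, 10, 15):
    print(k, count_regressor_subsets(k), 2 ** k)
```

Enumeration is only practical for small k, which is the point of the illustration: the number of subsets doubles with each added regressor.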

Methods for Selection of Subsets 

The type of method used to select a regression subset or reduce the data varies 
based on the type of analysis performed. Specific methods discussed in the present paper 
are common (principal) factor analysis, principal components analysis and principal 
components factor analysis. Figure 1 shows the relationship between these methods and 
other common methods such as confirmatory factor analysis, exploratory factor analysis 
and maximum-likelihood factor analysis. Principal components factor analysis is a 
combination of the two primary methods, principal components analysis and common 
factor analysis. The other three methods, confirmatory factor analysis, exploratory factor 
analysis and maximum-likelihood factor analysis, are considered types of true factor 
analysis. Maximum-likelihood analysis is frequently employed within both confirmatory 
and exploratory factor analysis. 

Differences between principal components analysis and common factor analysis are 
illustrated here during a general explanation of factor analysis, and a small, heuristic 
example of principal factor analysis is presented. Procedures involving the factor analysis 
model as a basis have been more widely used and are generally better developed (Velicer 
& Jackson, 1990). 

All generalized regression, data reduction techniques estimate parameters of 
regression-like linear models. Indeed, since canonical correlation analysis subsumes all 
parametric statistical methods (e.g., ANOVA, t-tests, discriminant analysis) as special cases 
(Knapp, 1978), and since canonical correlation analysis invokes a principal components 
analysis as part of its mathematics (Thompson, 1984), all parametric methods 
implicitly invoke some kind of factor analytic logic. 

Factor analytic techniques are multivariable, like the reality being modeled, and can 
be understood through the mathematics of matrix algebra. Each seeks a way to combine 
variables that accounts for patterns of variation in the observed dependent variables. This 
yields a simpler model, making further analysis less complicated. The following discussion 
is an introduction to factor analysis with similarities and differences to principal 
components analysis highlighted. 

Principal (Common) Factor Analysis versus Principal Components Analysis 

Factor analysis, like principal components analysis, regresses standardized 
observed variables on a set of unobserved factors. Factors are the underlying components 
or dimensions for which estimates of values are obtained. Factor analysis is a statistical 
model that includes unique, uncorrelated error terms, whereas principal components 
analysis is simply a mathematical transformation of data (Hamilton, 1992). Factor 
analysis attempts to model each of k standardized observed variables as a linear 
combination of j unobserved factors $F_1, \ldots, F_j$, where j < k, along with an error 
term for each observed variable, $u_k$. The factors are common factors, since each of the 
observed variables $Z_k$ is written in terms of these factors. The error term $u_k$ is called 
the unique factor, as each observed variable has its own uniquely determined residual. In 
general, the linear function for each $Z_k$ has the form 

(1)    $Z_k = l_{k1}F_1 + l_{k2}F_2 + \cdots + l_{kj}F_j + u_k$. 

The $l_{kj}$ are equivalent to standardized regression coefficients and are called factor 
loadings. If there is only one factor or if the factors are all orthogonal (uncorrelated), the 
factor loadings are equivalent to the correlation between that factor and the observed 
variable. The matrix notation for this model is 

(1)'   Z = FL' + U, 

where L' represents the transpose of the matrix containing the factor loadings. 
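
The claim that loadings equal correlations when the factors are orthogonal can be illustrated by simulation. The sketch below is not from the original paper: the loading matrix L, the sample size n and the random seed are all hypothetical values chosen for illustration. It builds data from model (1)' with uncorrelated factors and unique factors scaled so that each observed variable has unit variance, then compares the sample correlations with the loadings.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed so the sketch is repeatable
n = 200_000                      # number of simulated cases (hypothetical)

# Hypothetical loading matrix L: 4 observed variables, 2 orthogonal factors.
L = np.array([[0.8, 0.0],
              [0.7, 0.3],
              [0.2, 0.6],
              [0.0, 0.9]])
h2 = (L ** 2).sum(axis=1)                 # communality of each variable

F = rng.standard_normal((n, 2))           # uncorrelated common factors
U = rng.standard_normal((n, 4)) * np.sqrt(1.0 - h2)   # unique factors
Z = F @ L.T + U                           # model (1)': Z = FL' + U

# Sample correlation of every observed variable with every factor.
corr_ZF = np.array([[np.corrcoef(Z[:, k], F[:, m])[0, 1] for m in range(2)]
                    for k in range(4)])
print(np.round(corr_ZF, 2))
print(L)
```

Up to sampling error, the two printed matrices agree, which is the sense in which a loading is a correlation under orthogonality.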

Principal components analysis is a mathematical transformation of the data on the 
k observed variables represented by k principal components or factors. There is no 
unique factor or error term since k principal components will exactly explain all the 
variance of k observed variables (Hamilton, 1992). Principal components analysis is 
simpler mathematically than factor analysis and is a mathematical maximization procedure 
that uses uncorrelated linear functions. The linear function for principal components 
analysis is: 



(2)    $Z_k = l_{k1}F_1 + l_{k2}F_2 + \cdots + l_{kk}F_k$. 

Model (2) is similar to model (1) without the $u_k$ term and with j = k. The matrix 
equation (2)' will look like (1)' without the U matrix. 

(2)'   Z = FL'. 
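
That k components reproduce k standardized variables exactly can be checked numerically. The sketch below is an illustration in Python rather than the paper's Maple session; it uses the 10 x 5 response matrix from the Example section and, as an assumption of the sketch, takes the loading matrix to be the matrix of unit-length eigenvectors of R (unscaled components).

```python
import numpy as np

# Raw 0/1 responses from the Example section (10 lecturers, 5 questions).
X = np.array([
    [0, 0, 1, 0, 1],
    [0, 0, 1, 1, 1],
    [0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 1, 0, 1],
    [0, 1, 0, 1, 1],
    [1, 1, 0, 1, 1],
    [0, 1, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
], dtype=float)

Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)   # standardized variables
R = np.corrcoef(X, rowvar=False)                   # correlation matrix

# Eigen-decomposition of R; the columns of L are orthonormal eigenvectors.
eigenvalues, L = np.linalg.eigh(R)
F = Z @ L                 # all k = 5 principal components
Z_rebuilt = F @ L.T       # model (2)': Z = FL', with no error term

print(np.allclose(Z_rebuilt, Z))
print(np.allclose(F.var(axis=0, ddof=1), eigenvalues))
```

The second check shows that the variance of each component equals its eigenvalue, which is why the eigenvalues are used to judge how much each component contributes.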



Principal components factor analysis is a combination of true factor analysis and 
principal components analysis in that if fewer than k factors explain a large amount of the 
variance of the observed variables, those factors will be used and an error term will be 
introduced to represent the shared residual for each linear combination of factors. The 
difference between the $v_k$ error terms and the $u_k$ error terms for the true factor analysis 
model is that the $v_k$ are not unique, i.e., have nonzero correlation. Principal components 
factor analysis yields a principal components factor model that resembles the true factor 
analysis model and is 

(3)    $Z_k = l_{k1}F_1 + l_{k2}F_2 + \cdots + l_{kj}F_j + v_k$, 

where $v_k = l_{k,j+1}F_{j+1} + \cdots + l_{kk}F_k$. This linear function shows that 
these residuals cannot be uncorrelated as in the true factor analysis model (Hamilton, 
1992). For the remainder of this paper, the term principal components analysis will refer 
to model (3), since most researchers combine true factor analysis and principal 
components analysis into this model. The controversy regarding the similarities and 
differences between these three techniques is a lengthy issue. This paper will address only 
obvious differences in the representations of the mathematical models. For more extensive 
discussions of component analysis versus common factor analysis, see the January 
1990 issue of Multivariate Behavioral Research, the journal of the Society of Multivariate 
Experimental Psychology. 

Factor analysis centers on attempting to explain a set of observed variables in 
terms of their correlations. Principal components analysis centers on attempting to explain 
a set of observed variables in terms of their variance. The decision as to which of the 
methods to use in an analysis is not clear-cut, especially as these methods produce similar 
results when applied to strongly correlated data. Confusion with terminology and 
computer packages further complicates the choices, as principal components is typically 
listed as an option within a factor analysis computer package. Principal components 
analysis is the default method for extraction in SPSS. If true factor analysis is desired, the 
researcher must indicate another method of extraction, e.g., principal axis factoring 
(Pedhazur & Schmelkin, 1991). Component analysis will typically involve less computer 
processing time. 

"Principal components appeals more to a 'data analysis' perspective, whereas 
factor analysis fits better with a 'model building' approach," as Hamilton (1992, p. 252) 
noted. The goal of both types of analyses is to find subsets of variables that are highly 
correlated within each subset and weakly (or not at all) correlated with the variables in 
other subsets. Patterns for how the variables cluster are determined. 

Data reduction requires examining the output to decide which factors to retain in 
the model. Each factor will have an associated eigenvalue, denoted λ (lambda), to help 
in determining retention. Mathematically, eigenvalues are the roots of the characteristic 
polynomial associated with a given matrix. In the data reduction techniques, eigenvalues 
represent the variances of the original components. In principal components analysis, 
since k components explain k standardized variables, the sum of the eigenvalues will 
equal the number of variables. A component that has an eigenvalue of less than one will 
account for less than a single variable's variation since each standardized variable has 
variance of one. Thus, for principal components analysis, components with λ > 1 are 
retained in the model. For true factor analysis, eigenvalues are typically smaller and the 
eigenvalue greater than one criterion is inappropriate and not as useful (Pedhazur & 
Schmelkin, 1991). 
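
The eigenvalue properties described above can be checked on the data from the Example section. This is a minimal sketch in Python, not the paper's Maple session; it applies the eigenvalue greater than one criterion to the unmodified correlation matrix R, i.e., the principal components case.

```python
import numpy as np

# Raw 0/1 responses from the Example section (10 lecturers, 5 questions).
X = np.array([
    [0, 0, 1, 0, 1],
    [0, 0, 1, 1, 1],
    [0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 1, 0, 1],
    [0, 1, 0, 1, 1],
    [1, 1, 0, 1, 1],
    [0, 1, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
], dtype=float)

R = np.corrcoef(X, rowvar=False)
eigenvalues = np.linalg.eigvalsh(R)[::-1]        # largest first

# The eigenvalues of a k x k correlation matrix sum to k (its trace).
print("eigenvalues:", np.round(eigenvalues, 4))
print("sum of eigenvalues:", round(float(eigenvalues.sum()), 6))

# Eigenvalue greater than one criterion for principal components.
retained = int(np.sum(eigenvalues > 1.0))
print("components with eigenvalue > 1:", retained)
```

As the text cautions, the count produced by this rule is a recommendation to be weighed against substantive considerations, not a decision in itself.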

An analyst must bear in mind that these are simply recommendations and a large 
amount of subjectivity and thought are required when making these complex decisions. 
Substantive issues must be considered in the specific context of each particular research 
situation. 

Screeplots can be helpful to get an overview of the data. A screeplot is a plot of 
eigenvalues in descending order plotted against the factor number. As the slope of the 
lines between points becomes less steep or smaller, a leveling off becomes apparent. A 
clear break in the slopes, i.e., where they begin to approach a horizontal line, will help a 
researcher determine a useful or natural cut-off for contributing factors. 
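
A screeplot needs only the ordered eigenvalues. The sketch below, in Python, prints a rough text version for the Example data together with the drops between successive eigenvalues. It is an illustration under an assumption: it uses the eigenvalues of the correlation matrix R, whereas Figure 2 of this paper plots the eigenvalues from the last iteration of the factor analysis.

```python
import numpy as np

# Raw 0/1 responses from the Example section (10 lecturers, 5 questions).
X = np.array([
    [0, 0, 1, 0, 1],
    [0, 0, 1, 1, 1],
    [0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 1, 0, 1],
    [0, 1, 0, 1, 1],
    [1, 1, 0, 1, 1],
    [0, 1, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
], dtype=float)

R = np.corrcoef(X, rowvar=False)
eigenvalues = np.sort(np.linalg.eigvalsh(R))[::-1]   # descending order
drops = -np.diff(eigenvalues)                        # steepness between points

# Text screeplot: factor number, eigenvalue, and a bar scaled to the largest.
for number, value in enumerate(eigenvalues, start=1):
    bar = "*" * int(round(20 * value / eigenvalues[0]))
    print(f"{number}  {value:7.4f}  {bar}")
print("drops between successive eigenvalues:", np.round(drops, 4))
```

A leveling off shows up as a run of small values at the end of the list of drops.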
Since true factor analysis centers on correlations between the original observed 
variables, a correlation matrix R must be obtained. The original observed variables are 
first standardized and placed in a data matrix Z. To form the original correlation matrix 
R, the vectors (columns) of Z must be normalized by taking the product 
$A = \frac{1}{\sqrt{n-1}} Z$, where n is the number of observations. This scalar multiple 
takes out the $\sqrt{n-1}$ factor introduced by the standard deviation in the computation 
of the z-scores. The correlation matrix is the product of the matrix A and its transpose A'. 
Thus, R = A'A and has ones on its major diagonal. 
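
The normalization step can be verified directly. The following sketch, in Python rather than the Maple used in Appendix A, applies it to the Example data and compares A'A with a correlation matrix computed independently by NumPy.

```python
import numpy as np

# Raw 0/1 responses from the Example section (10 lecturers, 5 questions).
X = np.array([
    [0, 0, 1, 0, 1],
    [0, 0, 1, 1, 1],
    [0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 1, 0, 1],
    [0, 1, 0, 1, 1],
    [1, 1, 0, 1, 1],
    [0, 1, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
], dtype=float)

n = X.shape[0]
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)   # z-scores (n - 1 divisor)
A = Z / np.sqrt(n - 1)                             # columns now have length 1
R = A.T @ A                                        # correlation matrix

print(np.round(np.linalg.norm(A, axis=0), 6))      # column lengths of A
print(np.round(R, 6))
```

With n = 10 the scalar is 1/3, which matches the division by 3 in the Maple commands.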

Factor analysis uses a modified correlation matrix, R*, which has estimates of the 
proportion of a variable's explained variance on the major diagonal instead of ones, like the 
original R. Principal components analysis does not involve this reduction of the variance 
on the diagonal elements. The reduced variance terms are referred to as communalities and 
are denoted $h_i^2$. These values represent the proportion of variance explained by the 
extracted factors. The predictor is taken as the dependent variable and the factors are 
taken as the independent variables. One approach to estimating these values uses the 
coefficient of determination $R^2$. These values would appear on the major diagonal of 
R* as the initial estimates of $h_i^2$. Each $R_i^2$ is determined by regressing the 
ith standardized observed variable on the remaining standardized observed variables 
$Z_1, Z_2, \ldots, Z_{i-1}, Z_{i+1}, \ldots, Z_k$. Matrix algebra allows this computation by 
first forming $R^{-1}$, the inverse of the correlation matrix R. Take the diagonal entries of 
$R^{-1}$, invert them and subtract them from 1. The resulting values become the initial 
estimates of the communalities $h_i^2$ and are placed on the diagonal of R*. Thus, for a 
k x k correlation matrix R, the initial R* is given by 

$R^{*} = R - I + \mathrm{diag}(h_1^2, \ldots, h_k^2), \qquad h_i^2 = 1 - \frac{1}{r^{ii}},$ 

where $r^{ii}$ is the ith diagonal entry of $R^{-1}$ and I is the k x k identity matrix. 



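The initial modified correlation matrix has the form 

$R^{*} = \begin{bmatrix} h_1^2 & r_{12} & \cdots & r_{1k} \\ r_{21} & h_2^2 & \cdots & r_{2k} \\ \vdots & \vdots & \ddots & \vdots \\ r_{k1} & r_{k2} & \cdots & h_k^2 \end{bmatrix}$ 

It follows that R = R* + Q, where Q is the diagonal matrix containing the error terms 
$u_i^2$ on the major diagonal and zeros everywhere else. 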
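
The initial communality estimates can be computed in a few lines. The sketch below uses Python instead of the paper's Maple session and the data from the Example section. It forms the estimates from the diagonal of the inverse correlation matrix, cross-checks the first one against an explicit regression of the first standardized variable on the other four, and confirms that R = R* + Q.

```python
import numpy as np

# Raw 0/1 responses from the Example section (10 lecturers, 5 questions).
X = np.array([
    [0, 0, 1, 0, 1],
    [0, 0, 1, 1, 1],
    [0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 1, 0, 1],
    [0, 1, 0, 1, 1],
    [1, 1, 0, 1, 1],
    [0, 1, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
], dtype=float)

R = np.corrcoef(X, rowvar=False)
k = R.shape[0]

# Invert the diagonal entries of the inverse of R and subtract them from 1.
h2 = 1.0 - 1.0 / np.diag(np.linalg.inv(R))
R_star = R - np.eye(k) + np.diag(h2)      # communalities replace the ones
Q = np.diag(1.0 - h2)                     # diagonal matrix of error terms

# Cross-check: R-squared from regressing variable 1 on the other variables.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
others = Z[:, 1:]
beta, *_ = np.linalg.lstsq(others, Z[:, 0], rcond=None)
fitted = others @ beta
r2_first = (fitted @ fitted) / (Z[:, 0] @ Z[:, 0])

print(np.round(h2, 6))
print(round(float(r2_first), 6))
```

No intercept is needed in the regression because every standardized variable has mean zero.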

The initial factor loadings are derived by extracting the principal eigenvalues and 
forming the corresponding eigenvectors $e_m$, $1 \le m \le j < k$, each having norm (length) 
one. Set $a_n = \sum_{m=1}^{j} e_{mn}^2$, $n = 1, \ldots, k$, where $e_{mn}$ is the nth entry of 
$e_m$. Then each $a_n$ will be the sum of squares of the nth entries in each of the 
eigenvectors. 

Once the initial factor loadings are derived, the new R* is given by 

$R^{*} = R - I + \begin{bmatrix} a_1 & 0 & \cdots & 0 \\ 0 & a_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & a_k \end{bmatrix},$ 

with the new estimates for the communalities on the main diagonal. Iterations are performed 
until the communality estimates are stable. 
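
The iteration can be written as a short loop. The sketch below is in Python rather than Maple; the function name update and the choice of two retained factors and ten updates are assumptions that mirror the worked example in this paper. It follows the update exactly as described here, with communalities taken as sums of squared entries of the unit-length eigenvectors. Many textbook versions of principal axis factoring first scale each eigenvector by the square root of its eigenvalue; that variant is not shown.

```python
import numpy as np

# Raw 0/1 responses from the Example section (10 lecturers, 5 questions).
X = np.array([
    [0, 0, 1, 0, 1],
    [0, 0, 1, 1, 1],
    [0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 1, 0, 1],
    [0, 1, 0, 1, 1],
    [1, 1, 0, 1, 1],
    [0, 1, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
], dtype=float)

R = np.corrcoef(X, rowvar=False)
k = R.shape[0]
n_factors = 2                      # factors retained, as in the example


def update(R_star, n_factors):
    """One update: new communalities are the sums of squared entries of the
    unit eigenvectors belonging to the n_factors largest eigenvalues."""
    _, vectors = np.linalg.eigh(R_star)        # eigenvalues in ascending order
    top = vectors[:, -n_factors:]
    a = (top ** 2).sum(axis=1)
    return R - np.eye(k) + np.diag(a), a


# Initial estimates from the diagonal of the inverse correlation matrix.
h2 = 1.0 - 1.0 / np.diag(np.linalg.inv(R))
R_star = R - np.eye(k) + np.diag(h2)

history = [h2]
for _ in range(10):
    R_star, a = update(R_star, n_factors)
    history.append(a)

# Convergence check: largest change between the last two sets of estimates.
change = np.abs(history[-1] - history[-2]).max()
print(np.round(history[1], 6))
print(np.round(history[-1], 6))
print(change)
```

Comparing consecutive diagonals is the same check that the example performs by subtracting consecutive R* matrices.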

Principal components first determines the factor loadings to compute the 
communalities directly. The residual is then found by subtraction, $1 - h_i^2$. Principal 
components analysis performs no iterations of any kind and does not begin with estimates. 
Some researchers are uncomfortable with the estimation involved in the true factor analysis 
procedure (Stevens, 1992), along with other objections with respect to multicollinearities 
(Hawkins, 1973). 



Each factor can now be expressed in terms of the original variables. During the 
process of attempting to combine variables, composites called factor scores are formed. 
These scores derive from the coefficients found by the regression of factors on the 
observed variables. Factor scores are estimates of the factors and can be found by first 
computing the eigenvalues and corresponding eigenvectors for the last R*. The first 
factor will be a linear combination of the original variables that explains the most 
variance. This will be the factor that has the largest eigenvalue. The second factor $F_2$ 
will have the second largest eigenvalue, and so on. Each factor can be expressed as 

$F_n = \sum_{i=1}^{k} e_{ni} Z_i,$ 

where $\sum_{i=1}^{k} e_{ni}^2 = 1$. In matrix notation $F_n = Z e_n$, $1 \le n \le j$, where 
$e_n' e_n = 1$ and $e_n' e_m = 0$ for all $n \ne m$, since each component is uncorrelated 
with every other component (Hamilton, 1992). 
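
Factor scores follow from one matrix product. The sketch below, in Python, is an illustration under stated assumptions: it rebuilds the last R* from the correlation matrix of the Example data and the communalities reported for the final iteration of the worked example later in this paper, and it orders the two retained eigenvectors from largest to smaller eigenvalue. The sign of an eigenvector is arbitrary, so scores may differ from reported values by a factor of -1.

```python
import numpy as np

# Raw 0/1 responses from the Example section (10 lecturers, 5 questions).
X = np.array([
    [0, 0, 1, 0, 1],
    [0, 0, 1, 1, 1],
    [0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 1, 0, 1],
    [0, 1, 0, 1, 1],
    [1, 1, 0, 1, 1],
    [0, 1, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
], dtype=float)

Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
R = np.corrcoef(X, rowvar=False)

# Communalities reported for the final iteration of the worked example.
final_h2 = np.array([0.315847, 0.428876, 0.345713, 0.430176, 0.479350])
R_star = R - np.eye(5) + np.diag(final_h2)

eigenvalues, vectors = np.linalg.eigh(R_star)     # ascending order
order = np.argsort(eigenvalues)[::-1][:2]         # two largest eigenvalues
E = vectors[:, order]                             # unit-length eigenvectors
F = Z @ E                                         # factor scores, one column each

print(np.round(E.T @ E, 6))                       # identity: orthonormal columns
print(np.round(F, 4))
```

Each column of F can then be carried into a later analysis like any other variable.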

Factor scores replace the original observed scores and can be analyzed or 
interpreted like any other variable through regression, etc., as a subsequent analysis. If the 
same number of factors are retained from both factor analysis and principal components 
analysis, and when the factors are well-defined, highly similar results are expected from 
the two methods. Velicer and Jackson (1990) report a correlation of .99 or better 
between alternative types of scores in this situation. Even when loadings were low and 
factors were poorly defined with few variables per factor, correlations were .9 or more. 
"Improvements in the quality of the data increased the degree of similarity" (Velicer & 
Jackson, 1990, p. 6). Some of the observed differences between the two methods are 
thought to be the result of overextraction of the number of components by the Kaiser rule, 
the default in many computer programs that employ principal components analysis. 
Maximum-likelihood factor analysis done with large sample sizes can also cause problems 
with overextraction (Tucker & Lewis, 1973). This test assumes a multivariate normal 
distribution. Zwick and Velicer (1986) provide more on this topic. 

A rotation of the factor loadings is sometimes required to simplify the factor 
structure and make the factors more interpretable. If only one factor is retained in the 
model, then rotation is ignored. Mathematically, rotation is a transformation or a rotating 
of the axes represented by the factors about the origin that enables variables to load more 
strongly or polarize on a single factor. Orthogonal rotation holds the factor axes 
perpendicular, i.e., keeps them uncorrelated during rotation. The factor loading matrix L, 
having columns $l_1, l_2, \ldots, l_j$, is multiplied by an orthogonal transformation matrix 
M, to obtain a new factor loading matrix L*, where L* = LM. Then a least squares-like 
procedure is invoked. See Gorsuch (1983) for a discussion of various rotations. 

Oblique rotation permits acute angles (correlation) between the factor axes. This 
type of rotation permits further polarization and involves a nonorthogonal matrix 
transformation represented by the matrix equation L** = L*P, where L** is the matrix 
of new factor loadings and P is the nonorthogonal transformation matrix. Oblique 
rotation is more complex than orthogonal rotation and somewhat arbitrary, but since the 
loadings are further polarized, it provides easier interpretation. An analyst should use 
different rotation methods and examine the results. If different methods reach the same 
results, conclusions can be considered stable (Hamilton, 1992). The two types of rotation 
"reflect different frames of reference in viewing phenomena" (Pedhazur & Schmelkin, p. 
615). Communalities are not affected by rotation or type of rotation. 
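
For the orthogonal case, the invariance of the communalities is easy to confirm numerically. The sketch below, in Python, is illustrative only: it takes as its loading matrix the two eigenvectors reported in the worked example of this paper, and the 30 degree rotation angle is an arbitrary, hypothetical choice rather than the result of any rotation criterion.

```python
import numpy as np

# Columns are the two eigenvectors reported in the worked example.
L = np.array([[0.477755, -0.282451],
              [0.238664, -0.629440],
              [0.0775400, 0.590471],
              [0.630349, 0.009117],
              [0.558054, 0.418665]])

theta = np.deg2rad(30.0)                 # arbitrary angle, for illustration
M = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])   # orthogonal: M'M = I

L_rot = L @ M                            # L* = LM

before = (L ** 2).sum(axis=1)            # row sums of squares before rotation
after = (L_rot ** 2).sum(axis=1)         # and after rotation
print(np.round(before, 6))
print(np.round(after, 6))
```

The individual loadings change under rotation, but each row's sum of squares does not, because multiplying by an orthogonal matrix preserves vector lengths.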

Example 

Suppose a survey of 5 questions concerning treatment by peers was given to 10 
lecturers in a certain department at a large university. The questions are listed in Table 1 . 



Insert Table 1 about here 



The responses are recorded in a raw data matrix X, where a negative response is coded 
as 0 and a positive response is coded as 1. 

X = 
[ 0  0  1  0  1 ] 
[ 0  0  1  1  1 ] 
[ 0  1  0  0  0 ] 
[ 1  1  1  1  1 ] 
[ 0  0  1  0  1 ] 
[ 0  1  0  1  1 ] 
[ 1  1  0  1  1 ] 
[ 0  1  1  1  1 ] 
[ 0  0  1  1  1 ] 
[ 0  0  1  1  1 ] 



The matrix is entered into a MAPLE session (a computer algebra system), after loading 
the linear algebra package and setting the digits to 6. The statistics package is also 
loaded. The exact MAPLE commands for this example are listed in appendix A. 

A matrix of z-scores needs to be computed. The mean and standard deviation of 
each column of X is listed in Table 2. Exact arithmetic was used throughout all 
computations and then converted to 6 decimal places as each of the matrices needed to 
be examined. Only the decimal representations of the matrices are given in this paper, but 
the MAPLE commands for both the exact arithmetic matrices and the decimal 
representations are listed in appendix A. 



Insert Table 2 about here 



The matrix Z of standardized variables is 

Z = 
[ -.474342   -.948684    .621059   -1.44914     .316228 ] 
[ -.474342   -.948684    .621059    .621059     .316228 ] 
[ -.474342    .948684   -1.44914   -1.44914    -2.84605 ] 
[  1.89737    .948684    .621059    .621059     .316228 ] 
[ -.474342   -.948684    .621059   -1.44914     .316228 ] 
[ -.474342    .948684   -1.44914    .621059     .316228 ] 
[  1.89737    .948684   -1.44914    .621059     .316228 ] 
[ -.474342    .948684    .621059    .621059     .316228 ] 
[ -.474342   -.948684    .621059    .621059     .316228 ] 
[ -.474342   -.948684    .621059    .621059     .316228 ] 
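
The standardization can be reproduced outside MAPLE. The following is a minimal sketch in Python with NumPy; it assumes the sample standard deviation with divisor n - 1, which is the choice that agrees with the values in Table 2.

```python
import numpy as np

# Raw 0/1 responses X (10 lecturers, 5 questions).
X = np.array([
    [0, 0, 1, 0, 1],
    [0, 0, 1, 1, 1],
    [0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 1, 0, 1],
    [0, 1, 0, 1, 1],
    [1, 1, 0, 1, 1],
    [0, 1, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
], dtype=float)

means = X.mean(axis=0)             # column means
sds = X.std(axis=0, ddof=1)        # sample standard deviations (n - 1 divisor)
Z = (X - means) / sds              # matrix of z-scores

print(np.round(means, 6))
print(np.round(sds, 6))
print(np.round(Z, 6))
```

Each column of Z has mean 0 and sample variance 1, so the columns are directly comparable.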
Each column in Z must be normalized (given length 1) so that the correlation 
matrix R can be computed. This normalized data is listed in a matrix A. 

A = 
[ -.158114   -.316228    .207020   -.483046    .105409 ] 
[ -.158114   -.316228    .207020    .207020    .105409 ] 
[ -.158114    .316228   -.483046   -.483046   -.948684 ] 
[  .632456    .316228    .207020    .207020    .105409 ] 
[ -.158114   -.316228    .207020   -.483046    .105409 ] 
[ -.158114    .316228   -.483046    .207020    .105409 ] 
[  .632456    .316228   -.483046    .207020    .105409 ] 
[ -.158114    .316228    .207020    .207020    .105409 ] 
[ -.158114   -.316228    .207020    .207020    .105409 ] 
[ -.158114   -.316228    .207020    .207020    .105409 ] 



To obtain R, find A' (A transpose) and form the matrix product R = A'A. 

R = 
[  1.00000    .500000   -.218219    .327326     .166666 ] 
[  .500000    1.00000   -.654658    .218221    -.333334 ] 
[ -.218219   -.654658    .999999    .0476189    .509178 ] 
[  .327326    .218221    .0476189   .999998     .509177 ] 
[  .166666   -.333334    .509178    .509177     1.00000 ] 






To obtain the initial estimates of the communalities or shared variances, $R^{-1}$ (the 
inverse of R) is computed. 

$R^{-1}$ = 
[  1.60000   -1.          0          0         -.600000 ] 
[ -1.         2.70833    1.14565   -.763767    .875000 ] 
[  0          1.14565    2.10000    0         -.687389 ] 
[  0         -.763767    0          1.75000   -1.14565 ] 
[ -.600000    .875000   -.687389   -1.14565    2.32500 ] 



The diagonal entries are inverted and subtracted from 1. The resulting values become the 
new entries along the diagonal of R*. 

Rstar = 
[  .375000    .500000   -.218219    .327326     .166666 ] 
[  .500000    .630769   -.654658    .218221    -.333334 ] 
[ -.218219   -.654658    .523809    .0476189    .509178 ] 
[  .327326    .218221    .0476189   .428569     .509177 ] 
[  .166666   -.333334    .509178    .509177     .569892 ] 



To start the iterative process, the eigenvalues of R* are found. 

-.237107, -.138579, -.427933 x 10^-5, 1.21591, 1.68782 

Two of the eigenvalues are positive and three are negative. The eigenvectors 
corresponding to the positive eigenvalues are found. MAPLE will return these 
eigenvectors already scaled to have norm 1. 

e1 := [ .477755   .238664   .0775400   .630349   .558054 ] 
e2 := [ -.282451   -.629440   .590471   .009117   .418665 ] 



The next R* is formed by taking the sums of squares of the corresponding entries 
in these two eigenvectors and placing them on the diagonal as new estimates of the 
communalities. 

newRstar = 
[  .308029    .500000   -.218219    .327326     .166666 ] 
[  .500000    .453156   -.654658    .218221    -.333334 ] 
[ -.218219   -.654658    .354667    .0476189    .509178 ] 
[  .327326    .218221    .0476189   .397421     .509177 ] 
[  .166666   -.333334    .509178    .509177     .486704 ] 



This process will be repeated until the R* matrix converges. Convergence can be 
checked by looking at the difference in the last two consecutive R* matrices. 

The next iteration yields the eigenvalues 

-.352971, -.275860, -.0627888, 1.15194, 1.53967 

and new R* matrix 

[  .316082    .500000   -.218219    .327326     .166666 ] 
[  .500000    .435714   -.654658    .218221    -.333334 ] 
[ -.218219   -.654658    .344568    .0476189    .509178 ] 
[  .327326    .218221    .0476189   .417134     .509177 ] 
[  .166666   -.333334    .509178    .509177     .486483 ] 



Four more iterations are given 

[  .317330    .500000   -.218219    .327326     .166666 ] 
[  .500000    .431027   -.654658    .218221    -.333334 ] 
[ -.218219   -.654658    .343960    .0476189    .509178 ] 
[  .327326    .218221    .0476189   .423766     .509177 ] 
[  .166666   -.333334    .509178    .509177     .483915 ] 

[  .317135    .500000   -.218219    .327326     .166666 ] 
[  .500000    .429600   -.654658    .218221    -.333334 ] 
[ -.218219   -.654658    .344579    .0476189    .509178 ] 
[  .327326    .218221    .0476189   .426671     .509177 ] 
[  .166666   -.333334    .509178    .509177     .482011 ] 

[  .316738    .500000   -.218219    .327326     .166666 ] 
[  .500000    .429128   -.654658    .218221    -.333334 ] 
[ -.218219   -.654658    .345089    .0476189    .509178 ] 
[  .327326    .218221    .0476189   .428192     .509177 ] 
[  .166666   -.333334    .509178    .509177     .480855 ] 

[  .316402    .500000   -.218219    .327326     .166666 ] 
[  .500000    .428960   -.654658    .218221    -.333334 ] 
[ -.218219   -.654658    .345384    .0476189    .509178 ] 
[  .327326    .218221    .0476189   .429057     .509177 ] 
[  .166666   -.333334    .509178    .509177     .480177 ] 



and then a check for convergence: 

[ -.000336    0          0          0          0        ] 
[  0         -.000168    0          0          0        ] 
[  0          0          .000295    0          0        ] 
[  0          0          0          .000865    0        ] 
[  0          0          0          0         -.000678  ] 



Two more iterations 

[  .316173    .500000   -.218219    .327326     .166666 ] 
[  .500000    .428907   -.654658    .218221    -.333334 ] 
[ -.218219   -.654658    .345551    .0476189    .509178 ] 
[  .327326    .218221    .0476189   .429576     .509177 ] 
[  .166666   -.333334    .509178    .509177     .479794 ] 

[  .316016    .500000   -.218219    .327326     .166666 ] 
[  .500000    .428888   -.654658    .218221    -.333334 ] 
[ -.218219   -.654658    .345646    .0476189    .509178 ] 
[  .327326    .218221    .0476189   .429888     .509177 ] 
[  .166666   -.333334    .509178    .509177     .479573 ] 



check for convergence: 

[ -.000157    0          0          0          0        ] 
[  0         -.000019    0          0          0        ] 
[  0          0          .000095    0          0        ] 
[  0          0          0          .000312    0        ] 
[  0          0          0          0         -.000221  ] 



Two more iterations 

[  .315913    .500000   -.218219    .327326     .166666 ] 
[  .500000    .428879   -.654658    .218221    -.333334 ] 
[ -.218219   -.654658    .345685    .0476189    .509178 ] 
[  .327326    .218221    .0476189   .430062     .509177 ] 
[  .166666   -.333334    .509178    .509177     .479427 ] 

[  .315847    .500000   -.218219    .327326     .166666 ] 
[  .500000    .428876   -.654658    .218221    -.333334 ] 
[ -.218219   -.654658    .345713    .0476189    .509178 ] 
[  .327326    .218221    .0476189   .430176     .509177 ] 
[  .166666   -.333334    .509178    .509177     .479350 ] 



and another check for convergence: 

[ -.000066    0             0          0          0        ] 
[  0         -.3 x 10^-5    0          0          0        ] 
[  0          0             .000028    0          0        ] 
[  0          0             0          .000114    0        ] 
[  0          0             0          0         -.000077  ] 

That is close enough. 

To obtain estimates of the principal factors, find the eigenvalues and eigenvectors 
of the last R*. The principal factors are the product of Z and these eigenvectors, and 
are given by 

F1 := [ -1.19815   .158920   -2.54522   1.74197   -1.19815   .490422   1.63098 
        .601408   .158920   .158920 ] 

F2 := [ 1.18905   1.23712   -2.58979   -.613639   1.18905   -1.13620   -1.82583 
        .0759950   1.23712   1.23712 ] 

A screeplot of the eigenvalues from the last iteration is given in figure 2. 



Insert Figure 2 about here 



Conclusion 

Factor analysis is a regression-like data reduction technique that involves a 
generalized least squares procedure. As with all data reduction techniques, factor analysis 
seeks to combine variables into common, underlying factors that can be further analyzed. 
Matrix algebra helps illustrate the dynamic involved in the procedure. A computer algebra 
system such as MAPLE makes the matrix algebra bearable. An examination of factor 
analysis in this manner makes clear the processes that SAS and SPSS execute and keeps 
them from being a black box. 
Table 1 

Five questions asked to lecturers 

X1: Do you feel you have input in departmental decisions? 
X2: Do you feel the professional faculty consider you as an integral part of the program? 
X3: Are there procedures or events that cause you to feel unnecessarily separated from 
the rest of the faculty? 
X4: Does your Department Head or a designated supervisor discuss your evaluation with 
you each year? 
X5: Are you interested in long term employment at this university? 



Table 2 

Mean and standard deviation of the columns of X 

        Mean    Standard Deviation 
X1      .2      .421636 
X2      .5      .527048 
X3      .7      .483046 
X4      .7      .483046 
X5      .9      .316228 



Figure 1 

The relationship among some data reduction techniques 

[Diagram: PRINCIPAL COMPONENTS ANALYSIS and COMMON FACTOR ANALYSIS combine into 
PRINCIPAL COMPONENTS FACTOR ANALYSIS] 

Figure 2 

Screeplot of eigenvalues 
Appendix A 

Maple Commands - Note: Calls to eig_info and f below are session dependent. 

> Digits:=6: 
> with(linalg): 
> with(stats); 
> with(describe); 
> with(transform); 
> X:=matrix(10,5,[0,0,1,0,1,0,0,1,1,1,0,1,0,0,0,1,1,1,1,1,0,0,1,0,1,0,1,0,1,1,1,1,0,1,1,0,1,1,1,1,0,0,1,1,1,0,0,1,1,1]); 
> barlist:=NULL: for i from 1 to 5 do barlist:=barlist,mean(convert(col(X,i),list)): od: 
> Xbar:=vector(5,[barlist]); 
> devlist:=NULL: for i from 1 to 5 do 
> devlist:=devlist,standarddeviation[1](convert(col(X,i),list)): od: 
> Xdev:=vector(5,[devlist]); 
> for i from 1 to 5 do evalf(Xdev[i]); od; 
> matlist:=NULL: for i from 1 to 5 do 
> matlist:=matlist,standardscore[1](convert(col(X,i),list)): od: 
> Zstar:=transpose(matrix([matlist])); 
> Z:=map(evalf,Zstar); 
> Astar:=evalm(Zstar*1/3); 
> A:=map(evalf,Astar); 
> R1:=multiply(transpose(Astar),Astar); 
> R:=multiply(transpose(A),A); 
> R1inv:=inverse(R1); 
> Rinv:=map(evalf,R1inv); 
> diaglist:=NULL: for i from 1 to 5 do diaglist:=diaglist,1/Rinv[i,i]: od: 
> Rstar:=evalm(R-diag(diaglist)); 
> eigenvals(Rstar); 
> eig_info:=eigenvects(Rstar); 
> e1:=eig_info[1][3][1]; e2:=eig_info[4][3][1]; 
> comlist:=NULL: for i from 1 to 5 do comlist:=comlist,e1[i]^2+e2[i]^2: od: 
> newRstar:=evalm(R-diag(1,1,1,1,1)+diag(comlist)); 
> eigenvals(newRstar); 
> eig_info:=eigenvects(newRstar); 
> f:=proc(m,n) 
> local comlist, i, v1, v2: 
> global newRstar: 
> v1:=eig_info[m][3][1]: v2:=eig_info[n][3][1]: 
> comlist:=NULL: 
> for i from 1 to 5 do 
> comlist:=comlist,v1[i]^2+v2[i]^2: 
> od: 
> newRstar:=evalm(R-diag(1,1,1,1,1)+diag(comlist)); 
> end; 
> f(3,5); 
> eig_info:=eigenvects(newRstar); 
> f(3,5); 
> eig_info:=eigenvects(newRstar); 
> f(1,3); 
> eig_info:=eigenvects(newRstar); 
> f(1,2); 
> eig_info:=eigenvects(newRstar); 
> f(1,2); 
> evalm("-"""); 
> eig_info:=eigenvects(newRstar); 
> f(2,3); 
> eig_info:=eigenvects(newRstar); 
> f(3,5); 
> evalm("-"""); 
> eig_info:=eigenvects(newRstar); 
> f(2,3); 
> eig_info:=eigenvects(newRstar); 
> f(1,3); 
> evalm("-"""); 
> eig_info:=eigenvects(newRstar); 
> e1:=eig_info[1][3][1]; e2:=eig_info[3][3][1]; 
> F1:=multiply(Z,e1); F2:=multiply(Z,e2); 
> orderlist:=[3,1,4,5,2]: plist:=NULL: for i from 1 to 5 do 
> plist:=plist,i,eig_info[orderlist[i]][1]: od: plist; 
> plot([plist],style=line); 



